Overview

Dataset statistics

Number of variables: 20
Number of observations: 428
Missing cells: 88
Missing cells (%): 1.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 67.0 KiB
Average record size in memory: 160.3 B
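
The derived figures in this table follow from the raw counts. The sketch below redoes the arithmetic, assuming the missing-cell percentage is taken over all rows × variables cells and that 1 KiB = 1024 B; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 20
n_observations = 428
missing_cells = 88
total_size_kib = 67.0

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")        # 1.0%
print(f"{avg_record_bytes:.1f} B")  # 160.3 B
```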

Variable types

NUM: 11
BOOL: 8
CAT: 1

Warnings

vehicle_name has a high cardinality: 425 distinct values [High cardinality]
dealer_cost is highly correlated with retail_price [High correlation]
retail_price is highly correlated with dealer_cost [High correlation]
hwy_mpg is highly correlated with city_mpg [High correlation]
city_mpg is highly correlated with hwy_mpg [High correlation]
city_mpg has 15 (3.5%) missing values [Missing]
hwy_mpg has 15 (3.5%) missing values [Missing]
len has 26 (6.1%) missing values [Missing]
width has 28 (6.5%) missing values [Missing]
vehicle_name is uniformly distributed [Uniform]
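
The missing-value warnings can be cross-checked against the overview. The sketch below recomputes each percentage as a share of the 428 observations and sums the per-variable counts; it also includes weight and wheel_base, which each report 2 missing values in their variable sections but are not flagged here. The dictionary is transcribed by hand from this report, not produced by the tool.

```python
n_observations = 428

# Missing counts per variable, copied from the report; variables not listed
# here report 0 missing values.
missing = {
    "city_mpg": 15,
    "hwy_mpg": 15,
    "weight": 2,
    "wheel_base": 2,
    "len": 26,
    "width": 28,
}

# Percentage of rows missing for each variable, rounded to one decimal.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing.items()
}

total_missing = sum(missing.values())
print(missing_pct)
print(total_missing)  # 88, matching "Missing cells" in the overview
```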

Reproduction

Analysis started: 2020-12-04 12:51:15.930735
Analysis finished: 2020-12-04 12:51:34.766263
Duration: 18.84 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
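
A report of this kind can be generated with the pandas-profiling package named above. The sketch below is a minimal example, not the exact invocation used for this report: it assumes the data is available as a CSV file and uses the default configuration rather than the config.yaml referenced above. The function name, the file paths, and the title are hypothetical.

```python
def build_profile_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined even when the
    # libraries are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Pandas Profiling Report")
    profile.to_file(output_path)
    return profile
```

A call such as build_profile_report("cars.csv") would then write report.html next to the script; "cars.csv" is a placeholder name.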

Variables

vehicle_name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 425
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
Mercedes-Benz C320 4dr | 2 | 0.5%
Infiniti G35 4dr | 2 | 0.5%
Mercedes-Benz C240 4dr | 2 | 0.5%
Ford F-150 Regular Cab XL | 1 | 0.2%
Kia Amanti 4dr | 1 | 0.2%
Honda Pilot LX | 1 | 0.2%
Mazda B4000 SE Cab Plus | 1 | 0.2%
Oldsmobile Silhouette GL | 1 | 0.2%
Toyota Corolla S 4dr | 1 | 0.2%
Saab 9-5 Aero | 1 | 0.2%
Other values (415) | 415 | 97.0%

Frequencies of value counts

Unique

Unique: 422
Unique (%): 98.6%
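
The report lists 425 distinct values but 422 unique ones. The counts are consistent with reading "distinct" as the number of different values and "unique" as the number of values that occur exactly once; that reading is an assumption, but it matches the frequency table, where three names appear twice. The sketch below reproduces the two counts on toy data with the same shape (the names are placeholders, not the real vehicle names).

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values occurring exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# Toy column shaped like vehicle_name: 3 names appear twice, 422 appear once.
names = ["dup_a", "dup_b", "dup_c"] * 2 + [f"single_{i}" for i in range(422)]
print(len(names), distinct_and_unique(names))  # 428 (425, 422)
```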

Histogram of lengths of the category

Length

Max length: 45
Median length: 21
Mean length: 21.94392523
Min length: 8

smallsporty_compactlarge_sedan
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
1 | 244 | 57.0%
0 | 184 | 43.0%

sports_car
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 379 | 88.6%
1 | 49 | 11.4%

suv
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 368 | 86.0%
1 | 60 | 14.0%

wagon
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 398 | 93.0%
1 | 30 | 7.0%

minivan
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 408 | 95.3%
1 | 20 | 4.7%

pickup
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 404 | 94.4%
1 | 24 | 5.6%

awd
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 336 | 78.5%
1 | 92 | 21.5%

rwd
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 318 | 74.3%
1 | 110 | 25.7%

retail_price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 410
Distinct (%): 95.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32774.85514
Minimum: 10280
Maximum: 192465
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 10280
5-th percentile: 13691
Q1: 20334.25
median: 27635
Q3: 39205
95-th percentile: 72864.25
Maximum: 192465
Range: 182185
Interquartile range (IQR): 18870.75

Descriptive statistics

Standard deviation: 19431.71667
Coefficient of variation (CV): 0.5928848988
Kurtosis: 13.87920552
Mean: 32774.85514
Median Absolute Deviation (MAD): 8314
Skewness: 2.798099275
Sum: 14027638
Variance: 377591612.9
Monotonicity: Not monotonic
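
Several of the statistics above are derived from others. The sketch below recomputes them for retail_price from the figures in this report, using the standard definitions (mean = sum / count, range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared). It is a consistency check on the reported numbers, not the tool's own code, and the same relationships can be checked for the other numeric variables.

```python
import math

# Figures copied from the retail_price statistics above.
n = 428                      # observations; retail_price has no missing values
total = 14027638             # Sum
minimum, maximum = 10280, 192465
q1, q3 = 20334.25, 39205
std = 19431.71667            # Standard deviation

mean = total / n                 # Mean = Sum / number of observations
value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # CV = standard deviation / mean
variance = std ** 2              # Variance = standard deviation squared

print(round(mean, 5), value_range, iqr, round(cv, 6))
```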
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
19860 | 2 | 0.5%
15389 | 2 | 0.5%
31545 | 2 | 0.5%
33995 | 2 | 0.5%
19635 | 2 | 0.5%
21595 | 2 | 0.5%
49995 | 2 | 0.5%
29995 | 2 | 0.5%
35940 | 2 | 0.5%
25700 | 2 | 0.5%
Other values (400) | 408 | 95.3%

Minimum 5 values

Value | Count | Frequency (%)
10280 | 1 | 0.2%
10539 | 1 | 0.2%
10760 | 1 | 0.2%
10995 | 1 | 0.2%
11155 | 1 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
192465 | 1 | 0.2%
128420 | 1 | 0.2%
126670 | 1 | 0.2%
121770 | 1 | 0.2%
94820 | 1 | 0.2%

dealer_cost
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 425
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30014.70093
Minimum: 9875
Maximum: 173560
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 9875
5-th percentile: 12836.65
Q1: 18866
median: 25294.5
Q3: 35710.25
95-th percentile: 66471.95
Maximum: 173560
Range: 163685
Interquartile range (IQR): 16844.25

Descriptive statistics

Standard deviation: 17642.11775
Coefficient of variation (CV): 0.5877825599
Kurtosis: 13.94616377
Mean: 30014.70093
Median Absolute Deviation (MAD): 7531
Skewness: 2.834740404
Sum: 12846292
Variance: 311244318.7
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
19638 | 2 | 0.5%
14207 | 2 | 0.5%
68306 | 2 | 0.5%
37886 | 1 | 0.2%
24926 | 1 | 0.2%
21198 | 1 | 0.2%
23883 | 1 | 0.2%
24909 | 1 | 0.2%
13650 | 1 | 0.2%
24915 | 1 | 0.2%
Other values (415) | 415 | 97.0%

Minimum 5 values

Value | Count | Frequency (%)
9875 | 1 | 0.2%
10107 | 1 | 0.2%
10144 | 1 | 0.2%
10319 | 1 | 0.2%
10642 | 1 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
173560 | 1 | 0.2%
119600 | 1 | 0.2%
117854 | 1 | 0.2%
113388 | 1 | 0.2%
88324 | 1 | 0.2%

engine_size_(l)
Real number (ℝ≥0)

Distinct: 43
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.196728972
Minimum: 1.3
Maximum: 8.3
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 1.3
5-th percentile: 1.7
Q1: 2.375
median: 3
Q3: 3.9
95-th percentile: 5.3
Maximum: 8.3
Range: 7
Interquartile range (IQR): 1.525

Descriptive statistics

Standard deviation: 1.108594718
Coefficient of variation (CV): 0.3467903373
Kurtosis: 0.5419435378
Mean: 3.196728972
Median Absolute Deviation (MAD): 0.8
Skewness: 0.7081519825
Sum: 1368.2
Variance: 1.22898225
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=43)

Common values

Value | Count | Frequency (%)
3 | 42 | 9.8%
3.5 | 34 | 7.9%
2 | 30 | 7.0%
2.5 | 26 | 6.1%
2.4 | 23 | 5.4%
1.8 | 23 | 5.4%
4.6 | 21 | 4.9%
4.2 | 20 | 4.7%
3.2 | 18 | 4.2%
3.8 | 17 | 4.0%
Other values (33) | 174 | 40.7%

Minimum 5 values

Value | Count | Frequency (%)
1.3 | 2 | 0.5%
1.4 | 1 | 0.2%
1.5 | 6 | 1.4%
1.6 | 10 | 2.3%
1.7 | 4 | 0.9%

Maximum 5 values

Value | Count | Frequency (%)
8.3 | 1 | 0.2%
6.8 | 1 | 0.2%
6 | 6 | 1.4%
5.7 | 3 | 0.7%
5.6 | 2 | 0.5%

cyl
Real number (ℝ)

Distinct: 8
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.775700935
Minimum: -1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: -1
5-th percentile: 4
Q1: 4
median: 6
Q3: 6
95-th percentile: 8
Maximum: 12
Range: 13
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.622779362
Coefficient of variation (CV): 0.2809666532
Kurtosis: 1.396548909
Mean: 5.775700935
Median Absolute Deviation (MAD): 2
Skewness: 0.2342651493
Sum: 2472
Variance: 2.633412856
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=8)

Common values

Value | Count | Frequency (%)
6 | 190 | 44.4%
4 | 136 | 31.8%
8 | 87 | 20.3%
5 | 7 | 1.6%
12 | 3 | 0.7%
10 | 2 | 0.5%
-1 | 2 | 0.5%
3 | 1 | 0.2%

Minimum 5 values

Value | Count | Frequency (%)
-1 | 2 | 0.5%
3 | 1 | 0.2%
4 | 136 | 31.8%
5 | 7 | 1.6%
6 | 190 | 44.4%

Maximum 5 values

Value | Count | Frequency (%)
12 | 3 | 0.7%
10 | 2 | 0.5%
8 | 87 | 20.3%
6 | 190 | 44.4%
5 | 7 | 1.6%

hp
Real number (ℝ≥0)

Distinct: 110
Distinct (%): 25.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 215.885514
Minimum: 73
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 73
5-th percentile: 115
Q1: 165
median: 210
Q3: 255
95-th percentile: 338.25
Maximum: 500
Range: 427
Interquartile range (IQR): 90

Descriptive statistics

Standard deviation: 71.83603158
Coefficient of variation (CV): 0.3327505873
Kurtosis: 1.552158629
Mean: 215.885514
Median Absolute Deviation (MAD): 45
Skewness: 0.9303307363
Sum: 92399
Variance: 5160.415434
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
200 | 17 | 4.0%
210 | 14 | 3.3%
215 | 14 | 3.3%
225 | 13 | 3.0%
240 | 13 | 3.0%
220 | 12 | 2.8%
140 | 12 | 2.8%
300 | 11 | 2.6%
170 | 11 | 2.6%
130 | 10 | 2.3%
Other values (100) | 301 | 70.3%

Minimum 5 values

Value | Count | Frequency (%)
73 | 1 | 0.2%
93 | 1 | 0.2%
100 | 1 | 0.2%
103 | 5 | 1.2%
104 | 3 | 0.7%

Maximum 5 values

Value | Count | Frequency (%)
500 | 1 | 0.2%
493 | 3 | 0.7%
477 | 1 | 0.2%
450 | 1 | 0.2%
420 | 1 | 0.2%

city_mpg
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 28
Distinct (%): 6.8%
Missing: 15
Missing (%): 3.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.08958838
Minimum: 10
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 10
5-th percentile: 14
Q1: 17
median: 19
Q3: 21
95-th percentile: 29
Maximum: 60
Range: 50
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 5.219382573
Coefficient of variation (CV): 0.2598053516
Kurtosis: 16.61615357
Mean: 20.08958838
Median Absolute Deviation (MAD): 2
Skewness: 2.928557209
Sum: 8297
Variance: 27.24195444
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=28)

Common values

Value | Count | Frequency (%)
18 | 68 | 15.9%
20 | 56 | 13.1%
17 | 40 | 9.3%
21 | 38 | 8.9%
19 | 38 | 8.9%
16 | 29 | 6.8%
24 | 22 | 5.1%
26 | 21 | 4.9%
22 | 18 | 4.2%
15 | 17 | 4.0%
Other values (18) | 66 | 15.4%
(Missing) | 15 | 3.5%

Minimum 5 values

Value | Count | Frequency (%)
10 | 1 | 0.2%
12 | 2 | 0.5%
13 | 11 | 2.6%
14 | 13 | 3.0%
15 | 17 | 4.0%

Maximum 5 values

Value | Count | Frequency (%)
60 | 1 | 0.2%
59 | 1 | 0.2%
46 | 1 | 0.2%
38 | 1 | 0.2%
36 | 1 | 0.2%

hwy_mpg
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 32
Distinct (%): 7.7%
Missing: 15
Missing (%): 3.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.90556901
Minimum: 12
Maximum: 66
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 12
5-th percentile: 18
Q1: 24
median: 26
Q3: 29
95-th percentile: 36
Maximum: 66
Range: 54
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.70371136
Coefficient of variation (CV): 0.2119899921
Kurtosis: 6.425357238
Mean: 26.90556901
Median Absolute Deviation (MAD): 3
Skewness: 1.350295982
Sum: 11112
Variance: 32.53232328
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=32)

Common values

Value | Count | Frequency (%)
26 | 54 | 12.6%
25 | 43 | 10.0%
28 | 38 | 8.9%
29 | 34 | 7.9%
27 | 27 | 6.3%
24 | 26 | 6.1%
30 | 23 | 5.4%
23 | 16 | 3.7%
21 | 16 | 3.7%
19 | 15 | 3.5%
Other values (22) | 121 | 28.3%
(Missing) | 15 | 3.5%

Minimum 5 values

Value | Count | Frequency (%)
12 | 1 | 0.2%
14 | 1 | 0.2%
16 | 2 | 0.5%
17 | 9 | 2.1%
18 | 9 | 2.1%

Maximum 5 values

Value | Count | Frequency (%)
66 | 1 | 0.2%
51 | 2 | 0.5%
46 | 1 | 0.2%
44 | 1 | 0.2%
43 | 2 | 0.5%

weight
Real number (ℝ≥0)

Distinct: 347
Distinct (%): 81.5%
Missing: 2
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 3577.213615
Minimum: 1850
Maximum: 7190
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 1850
5-th percentile: 2513
Q1: 3102
median: 3474.5
Q3: 3974.25
95-th percentile: 4996.75
Maximum: 7190
Range: 5340
Interquartile range (IQR): 872.25

Descriptive statistics

Standard deviation: 760.4376628
Coefficient of variation (CV): 0.2125782088
Kurtosis: 1.678289561
Mean: 3577.213615
Median Absolute Deviation (MAD): 428
Skewness: 0.8933847105
Sum: 1523893
Variance: 578265.439
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
3285 | 4 | 0.9%
3175 | 4 | 0.9%
3470 | 3 | 0.7%
4024 | 3 | 0.7%
3217 | 3 | 0.7%
2676 | 3 | 0.7%
4052 | 3 | 0.7%
2692 | 3 | 0.7%
3430 | 3 | 0.7%
3428 | 3 | 0.7%
Other values (337) | 394 | 92.1%

Minimum 5 values

Value | Count | Frequency (%)
1850 | 1 | 0.2%
2035 | 1 | 0.2%
2055 | 1 | 0.2%
2085 | 1 | 0.2%
2195 | 1 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
7190 | 1 | 0.2%
6400 | 1 | 0.2%
6133 | 1 | 0.2%
5969 | 1 | 0.2%
5879 | 1 | 0.2%

wheel_base
Real number (ℝ≥0)

Distinct: 40
Distinct (%): 9.4%
Missing: 2
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 108.1737089
Minimum: 89
Maximum: 144
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 89
5-th percentile: 95.25
Q1: 103
median: 107
Q3: 112
95-th percentile: 123
Maximum: 144
Range: 55
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 8.326449076
Coefficient of variation (CV): 0.07697294619
Kurtosis: 2.112464038
Mean: 108.1737089
Median Absolute Deviation (MAD): 5
Skewness: 0.9552742051
Sum: 46082
Variance: 69.32975421
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=40)

Common values

Value | Count | Frequency (%)
107 | 45 | 10.5%
103 | 30 | 7.0%
106 | 27 | 6.3%
112 | 25 | 5.8%
104 | 22 | 5.1%
105 | 21 | 4.9%
115 | 20 | 4.7%
111 | 17 | 4.0%
109 | 17 | 4.0%
101 | 16 | 3.7%
Other values (30) | 186 | 43.5%

Minimum 5 values

Value | Count | Frequency (%)
89 | 2 | 0.5%
93 | 9 | 2.1%
95 | 11 | 2.6%
96 | 5 | 1.2%
97 | 3 | 0.7%

Maximum 5 values

Value | Count | Frequency (%)
144 | 2 | 0.5%
140 | 1 | 0.2%
137 | 1 | 0.2%
133 | 2 | 0.5%
131 | 1 | 0.2%

len
Real number (ℝ≥0)

MISSING

Distinct: 61
Distinct (%): 15.2%
Missing: 26
Missing (%): 6.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 185.1268657
Minimum: 143
Maximum: 227
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 143
5-th percentile: 162.05
Q1: 177
median: 186
Q3: 193
95-th percentile: 207
Maximum: 227
Range: 84
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 13.31252292
Coefficient of variation (CV): 0.07191027013
Kurtosis: 0.3112242615
Mean: 185.1268657
Median Absolute Deviation (MAD): 8
Skewness: -0.09622117289
Sum: 74421
Variance: 177.2232665
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
178 | 26 | 6.1%
190 | 22 | 5.1%
187 | 17 | 4.0%
192 | 14 | 3.3%
200 | 13 | 3.0%
177 | 13 | 3.0%
188 | 13 | 3.0%
179 | 13 | 3.0%
175 | 12 | 2.8%
183 | 12 | 2.8%
Other values (51) | 247 | 57.7%
(Missing) | 26 | 6.1%

Minimum 5 values

Value | Count | Frequency (%)
143 | 1 | 0.2%
144 | 1 | 0.2%
150 | 1 | 0.2%
153 | 2 | 0.5%
154 | 1 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
227 | 1 | 0.2%
221 | 1 | 0.2%
219 | 2 | 0.5%
215 | 2 | 0.5%
212 | 7 | 1.6%

width
Real number (ℝ≥0)

MISSING

Distinct: 18
Distinct (%): 4.5%
Missing: 28
Missing (%): 6.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.2925
Minimum: 64
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 64
5-th percentile: 67
Q1: 69
median: 71
Q3: 73
95-th percentile: 78
Maximum: 81
Range: 17
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.393483915
Coefficient of variation (CV): 0.04759945177
Kurtosis: -0.2123582009
Mean: 71.2925
Median Absolute Deviation (MAD): 2
Skewness: 0.5607116725
Sum: 28517
Variance: 11.51573308
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=18)

Common values

Value | Count | Frequency (%)
72 | 55 | 12.9%
68 | 47 | 11.0%
69 | 42 | 9.8%
73 | 42 | 9.8%
70 | 41 | 9.6%
71 | 36 | 8.4%
67 | 36 | 8.4%
74 | 21 | 4.9%
75 | 17 | 4.0%
78 | 16 | 3.7%
Other values (8) | 47 | 11.0%
(Missing) | 28 | 6.5%

Minimum 5 values

Value | Count | Frequency (%)
64 | 1 | 0.2%
65 | 3 | 0.7%
66 | 10 | 2.3%
67 | 36 | 8.4%
68 | 47 | 11.0%

Maximum 5 values

Value | Count | Frequency (%)
81 | 1 | 0.2%
80 | 2 | 0.5%
79 | 12 | 2.8%
78 | 16 | 3.7%
77 | 6 | 1.4%

Interactions

[Figure: 121 pairwise interaction plots of the 11 numeric variables (11 × 11); images not preserved]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the slope of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
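
The calculation described above can be sketched in plain Python. This is an illustrative implementation of the textbook formula on toy data, not the code the report itself runs; the function name and the sample values are made up for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data (not from the dataset): y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2 * v + 1 for v in x]

print(round(pearson_r(x, y), 6))                # 1.0
print(round(pearson_r(x, [-v for v in y]), 6))  # -1.0

# Shifting and rescaling x leaves r unchanged (location/scale invariance).
x_rescaled = [10 * v + 3 for v in x]
print(math.isclose(pearson_r(x, y), pearson_r(x_rescaled, y)))  # True
```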

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
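
As a sketch of this rank-based calculation, the example below replaces each value by its rank and then applies the Pearson formula to the ranks. Giving tied values the average of their positions is a common convention and is assumed here; the function names and toy data are illustrative, not taken from the report.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data: y grows monotonically but nonlinearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(round(spearman_rho(x, y), 6))  # 1.0
print(pearson_r(x, y) < 1.0)         # True: the relationship is not linear
```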

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
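
The pair-counting definition above can be sketched directly. This version implements exactly the formula as stated (often called tau-a), with tied pairs counted as neither concordant nor discordant; statistical libraries commonly use a tie-adjusted variant, so the value in the report's heatmap may differ when ties are present. The function name and toy data are illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 is a tie and counts as neither
    return (concordant - discordant) / len(pairs)


# Toy data: one swapped pair out of six, so 5 concordant and 1 discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau(x, y))
```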

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

[Figure: 4 missing-value plots; images not preserved]

Sample

First rows

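index | vehicle_name | smallsporty_compactlarge_sedan | sports_car | suv | wagon | minivan | pickup | awd | rwd | retail_price | dealer_cost | engine_size_(l) | cyl | hp | city_mpg | hwy_mpg | weight | wheel_base | len | width
0 | Acura 3.5 RL 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 43755 | 39014 | 3.5 | 6 | 225 | 18.0 | 24.0 | 3880.0 | 115.0 | 197.0 | 72.0
1 | Acura 3.5 RL w/Navigation 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 46100 | 41100 | 3.5 | 6 | 225 | 18.0 | 24.0 | 3893.0 | 115.0 | 197.0 | 72.0
2 | Acura MDX | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 36945 | 33337 | 3.5 | 6 | 265 | 17.0 | 23.0 | 4451.0 | 106.0 | 189.0 | 77.0
3 | Acura NSX coupe 2dr manual S | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 89765 | 79978 | 3.2 | 6 | 290 | 17.0 | 24.0 | 3153.0 | 100.0 | 174.0 | 71.0
4 | Acura RSX Type S 2dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 23820 | 21761 | 2.0 | 4 | 200 | 24.0 | 31.0 | 2778.0 | 101.0 | 172.0 | 68.0
5 | Acura TL 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 33195 | 30299 | 3.2 | 6 | 270 | 20.0 | 28.0 | 3575.0 | 108.0 | 186.0 | 72.0
6 | Acura TSX 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 26990 | 24647 | 2.4 | 4 | 200 | 22.0 | 29.0 | 3230.0 | 105.0 | 183.0 | 69.0
7 | Audi A4 1.8T 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25940 | 23508 | 1.8 | 4 | 170 | 22.0 | 31.0 | 3252.0 | 104.0 | 179.0 | 70.0
8 | Audi A4 3.0 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 31840 | 28846 | 3.0 | 6 | 220 | 20.0 | 28.0 | 3462.0 | 104.0 | 179.0 | 70.0
9 | Audi A4 3.0 convertible 2dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 42490 | 38325 | 3.0 | 6 | 220 | 20.0 | 27.0 | 3814.0 | 105.0 | 180.0 | 70.0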

Last rows

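index | vehicle_name | smallsporty_compactlarge_sedan | sports_car | suv | wagon | minivan | pickup | awd | rwd | retail_price | dealer_cost | engine_size_(l) | cyl | hp | city_mpg | hwy_mpg | weight | wheel_base | len | width
418 | Volvo S40 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25135 | 23701 | 1.9 | 4 | 170 | 22.0 | 29.0 | 2767.0 | 101.0 | 178.0 | 68.0
419 | Volvo S60 2.5 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 31745 | 29916 | 2.5 | 5 | 208 | 20.0 | 27.0 | 3903.0 | 107.0 | 180.0 | 71.0
420 | Volvo S60 R 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 37560 | 35382 | 2.5 | 5 | 300 | 18.0 | 25.0 | 3571.0 | 107.0 | 181.0 | 71.0
421 | Volvo S60 T5 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 34845 | 32902 | 2.3 | 5 | 247 | 20.0 | 28.0 | 3766.0 | 107.0 | 180.0 | 71.0
422 | Volvo S80 2.5T 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 37885 | 35688 | 2.5 | 5 | 194 | 20.0 | 27.0 | 3691.0 | 110.0 | 190.0 | 72.0
423 | Volvo S80 2.9 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 37730 | 35542 | 2.9 | 6 | 208 | 20.0 | 28.0 | 3576.0 | 110.0 | 190.0 | 72.0
424 | Volvo S80 T6 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 45210 | 42573 | 2.9 | 6 | 268 | 19.0 | 26.0 | 3653.0 | 110.0 | 190.0 | 72.0
425 | Volvo V40 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 26135 | 24641 | 1.9 | 4 | 170 | 22.0 | 29.0 | 2822.0 | 101.0 | 180.0 | 68.0
426 | Volvo XC70 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 35145 | 33112 | 2.5 | 5 | 208 | NaN | NaN | 3823.0 | 109.0 | 186.0 | 73.0
427 | Volvo XC90 T6 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 41250 | 38851 | 2.9 | 6 | 268 | 15.0 | 20.0 | 4638.0 | 113.0 | 189.0 | 75.0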